Colorful Triangle Counting and a MapReduce Implementation

نویسندگان

  • Rasmus Pagh
  • Charalampos E. Tsourakakis
چکیده

In this note we introduce a new randomized algorithm for counting triangles in graphs. We show that under mild conditions, the estimate of our algorithm is strongly concentrated around the true number of triangles. Specifically, if p ≥ max ( log n t , log n √ t ), where n, t, ∆ denote the number of vertices in G, the number of triangles in G, the maximum number of triangles an edge of G is contained, then for any constant > 0 our unbiased estimate T is concentrated around its expectation, i.e., Pr [|T − E [T ] | ≥ E [T ]] = o(1). Finally, we present a MapReduce implementation of our algorithm.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Approximate Triangle Counting

Triangle counting is an important problem in graph mining. Clustering coefficients of vertices and the transitivity ratio of the graph are two metrics often used in complex network analysis. Furthermore, triangles have been used successfully in several real-world applications. However, exact triangle counting is an expensive computation. In this paper we present the analysis of a practical samp...

متن کامل

A Review on Large Scale Graph Processing Using Big Data Based Parallel Programming Models

Processing big graphs has become an increasingly essential activity in various fields like engineering, business intelligence and computer science. Social networks and search engines usually generate large graphs which demands sophisticated techniques for social network analysis and web structure mining. Latest trends in graph processing tend towards using Big Data platforms for parallel graph ...

متن کامل

MapReduce vs. Pipelining Counting Triangles

In this paper we follow an alternative approach named pipeline, to implement a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To be concrete, we implement a dynamic pipeline of processes and an ad-hoc MapReduce version using the language Go....

متن کامل

Comparing MapReduce and Pipeline Implementations for Counting Triangles

A common method to define a parallel solution for a computational problem consists in finding a way to use the Divide & Conquer paradigm in order to have processors acting on its own data and scheduled in a parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the...

متن کامل

Counting Triangles in Massive Graphs with MapReduce

Graphs and networks are used to model interactions in a variety of contexts. There is a growing need to quickly assess the characteristics of a graph in order to understand its underlying structure. Some of the most useful metrics are triangle-based and give a measure of the connectedness of mutual friends. This is often summarized in terms of clustering coefficients, which measure the likeliho...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Inf. Process. Lett.

دوره 112  شماره 

صفحات  -

تاریخ انتشار 2012